Towards Automatic Software Lineage Inference
Abstract
Software continuously evolves to reflect changing requirements, feature updates, and bug fixes. Most existing research focuses on analyzing software release histories to understand the software evolution process and to describe evolutionary relationships among programs. However, there has been little research on inferring software lineage from (binary) programs. In this paper, we take a systematic approach towards software lineage inference. We explore three fundamental questions not addressed by existing work. First, how do we measure the quality of a lineage inference algorithm? Second, given existing approaches to binary similarity analysis, how good are they for lineage both currently and in an idealized setting? Third, what are the challenging problems in software lineage inference? Towards these goals we build LIMetric, a system for automatic software lineage inference of binary programs. We evaluated LIMetric on two types of lineage: straight line lineage and directed acyclic graph (DAG) lineage. We have also extended our technique to handle multiple straight line lineages. Our experiments used large scale real-world programs, with a total of 1,777 releases spanning over a combined 110 years of development history. In order to quantify lineage quality, we propose four metrics: (i) number of inversions and (ii) edit distance to monotonicity for straight line lineage, and (iii) number of lowest common ancestor (LCA) mismatches and (iv) average pairwise distance to true LCA for DAG lineage. LIMetric effectively extracted software derivation relationships among binary programs with high accuracy. Through close case analysis, we also formulate several challenging problems in software lineage inference that need to be addressed to attain even higher accuracy.
Keywords: software evolution, software lineage, systematic evaluation
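The two straight line metrics named in the abstract can be illustrated with a minimal sketch. It assumes the textbook definitions, which the abstract itself does not spell out: an inversion is a pair of releases that appear in the wrong relative order, and the edit distance to monotonicity is the smallest number of releases that must be removed so the remaining ones are in increasing order. The function names and the representation of a lineage as a list of true release indices are hypothetical choices for illustration, not LIMetric's actual interface.

```python
from bisect import bisect_left


def count_inversions(order):
    """Count pairs (i, j) with i < j whose releases are out of order.

    `order` lists the true release indices in the order the inferred
    lineage placed them (assumed representation, distinct integers).
    """
    n = len(order)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if order[i] > order[j]
    )


def edit_distance_to_monotonicity(order):
    """Minimum number of elements to delete so the rest is increasing.

    Computed as len(order) minus the length of a longest strictly
    increasing subsequence (patience-sorting technique).
    """
    tails = []  # tails[k] = smallest tail of an increasing run of length k + 1
    for x in order:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(order) - len(tails)


if __name__ == "__main__":
    inferred = [0, 1, 3, 2, 4, 6, 5]  # hypothetical inferred ordering
    print(count_inversions(inferred))
    print(edit_distance_to_monotonicity(inferred))
```

Under these assumed definitions, a perfectly recovered straight line lineage scores zero on both metrics, and larger values indicate an ordering further from the true release history.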
Similar resources
Towards Automatic Software Lineage Inference
Software lineage refers to the evolutionary relationship among a collection of software. The goal of software lineage inference is to recover the lineage given a set of program binaries. Software lineage can provide extremely useful information in many security scenarios such as malware triage and software vulnerability tracking. In this paper, we systematically study software lineage inference...
Applications of Abduction: a Unified Framework for Software and Knowledge Engineering
A new framework is proposed in which software engineering (SE) is the construction of a search space and knowledge engineering (KE) is the construction of the intelligence to control the traversal of that space. Conventional information systems and object-oriented notations can specify the search space. An abductive inference engine can implement the intelligent control. This unified framework su...
On Principles of Software Engineering - Role of the Inductive Inference
This paper highlights the role of the inductive inference principle in software engineering. It takes the challenge to settle differences and to confront the ideas behind the usual software engineering concepts. We focus on the inductive inference mechanism’s role behind the automatic program construction activities and software evolution. We believe that the revision of rather old ideas in ...
Bayesian Inference of Reticulate Phylogenies under the Multispecies Network Coalescent
The multispecies coalescent (MSC) is a statistical framework that models how gene genealogies grow within the branches of a species tree. The field of computational phylogenetics has witnessed an explosion in the development of methods for species tree inference under MSC, owing mainly to the accumulating evidence of incomplete lineage sorting in phylogenomic analyses. However, the evolutionary...
Automatic Inference of Interface Properties from Program Source Code
Our research proposes a novel framework to automatically infer system-specific interface properties from program source code using static model-checking traces. Area: Software Engineering, sub-area: Software Verification
Publication date: 2012